Online Imbalanced Support Vector Machine for Phishing Emails Filtering

نویسندگان

  • XiaoQing Gu
  • TongGuang Ni
  • Wei Wang
چکیده

Phishing emails are a real threat to internet communication and web economy. In real-world emails datasets, data are predominately composed of ham samples with only a small percentage of phishing ones. Standard Support Vector Machine (SVM) could produce suboptimal results in filtering phishing emails, and it often requires much time to perform the classification for large data sets. In this paper, an online version of imbalanced SVM (OISVM) is proposed. First an email is converted into 20 features which are well selected based on its content and link characters. Second, OISVM is developed to optimize the classification accuracy and reduce computation time, which is used a novel method to adjust the separation hyperplane of imbalanced date sets and an online algorithm to make the retaining process much fast. Compared to the existing methods, the experimental results show that OISVM can achieve significantly using a proposed expressive evaluation method.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Spam Sender Detection with Classification Modeling on Highly Imbalanced Mail Server Behavior Data

Unsolicited commercial or bulk emails or emails containing viruses pose a great threat to the utility of email communications. A recent solution for filtering is reputation systems that can assign a value of trust to each IP address sending email messages. By analyzing the query patterns of each node utilizing reputation information, reputation systems can calculate a reputation score for each ...

متن کامل

Improved Phishing Detection using Model-Based Features

Phishing emails are a real threat to internet communication and web economy. Criminals are trying to convince unsuspecting online users to reveal passwords, account numbers, social security numbers or other personal information. Filtering approaches using blacklists are not completely effective as about every minute a new phishing scam is created. We investigate the statistical filtering of phi...

متن کامل

MDMap: Assisting Users in Identifying Phishing Emails

Email-based online phishing is one of the key security threats that greatly deteriorate the trustworthiness of the Internet. Although many spam filters have been developed and deployed, a non-negligible number of phishing emails still sneak into users’ inboxes each day. Phishing emails often contain suspicious information that separate them from the legitimate ones; however, average non-expert ...

متن کامل

Unweaving the Phisher's Net: An Exploratory Study

Over 29,000 phishing emails are reported each month on average to the AntiPhishing Working Group. If we consider that at least 5% of these emails achieve their target, at least 1,450 distinct email users a month are caught in the phisher’s net. This study attempts to understand the basic deception techniques utilized by phishers when creating the phishing emails. Exploratory content and linguis...

متن کامل

Online Voltage Stability Monitoring and Prediction by Using Support Vector Machine Considering Overcurrent Protection for Transmission Lines

In this paper, a novel method is proposed to monitor the power system voltage stability using Support Vector Machine (SVM) by implementing real-time data received from the Wide Area Measurement System (WAMS). In this study, the effects of the protection schemes on the voltage magnitude of the buses are considered while they have not been investigated in previous researches. Considering overcurr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014